Goto

Collaborating Authors

 cost-sensitive classification



Export Reviews, Discussions, Author Feedback and Meta-Reviews

Neural Information Processing Systems

First provide a summary of the paper, and then address the following criteria: Quality, clarity, originality and significance. This paper provides some theoretical analysis on Positive and Unlabeled Data learning (PU), when only the positive instances and unlabeled instances are available. The authors show that learning from PU is equivalent to a cost-sensitive classification task if the label prior is known. The main contribution of the paper is that, the authors show that using any convex loss leads to inconsistent classifier for PU tasks. Instead, using a non-convex ramp loss gives a consistent estimator. This theoretical justification is supported by experiments, which demonstrate that adopting hinge loss of SVM may result in very bad classification error comparing to using the non-convex ramp loss.




Optimizing F-Measures by Cost-Sensitive Classification

Shameem Puthiya Parambath, Nicolas Usunier, Yves Grandvalet

Neural Information Processing Systems

We present a theoretical analysis of F -measures for binary, multiclass and mul-tilabel classification. These performance measures are non-linear, but in many scenarios they are pseudo-linear functions of the per-class false negative/false positive rate. Based on this observation, we present a general reduction of F - measure maximization to cost-sensitive classification with unknown costs. We then propose an algorithm with provable guarantees to obtain an approximately optimal classifier for the F -measure by solving a series of cost-sensitive classification problems. The strength of our analysis is to be valid on any dataset and any class of classifiers, extending the existing theoretical results on F -measures, which are asymptotic in nature. We present numerical experiments to illustrate the relative importance of cost asymmetry and thresholding when learning linear classifiers on various F -measure optimization tasks.


Optimizing F-Measures by Cost-Sensitive Classification

Neural Information Processing Systems

We present a theoretical analysis of F-measures for binary, multiclass and multilabel classification. These performance measures are non-linear, but in many scenarios they are pseudo-linear functions of the per-class false negative/false positive rate. Based on this observation, we present a general reduction of F-measure maximization to cost-sensitive classification with unknown costs. We then propose an algorithm with provable guarantees to obtain an approximately optimal classifier for the F-measure by solving a series of cost-sensitive classification problems. The strength of our analysis is to be valid on any dataset and any class of classifiers, extending the existing theoretical results on F-measures, which are asymptotic in nature. We present numerical experiments to illustrate the relative importance of cost asymmetry and thresholding when learning linear classifiers on various F-measure optimization tasks.


Analyzing Cost-Sensitive Surrogate Losses via $\mathcal{H}$-calibration

Shah, Sanket, Tambe, Milind, Finocchiaro, Jessie

arXiv.org Artificial Intelligence

This paper aims to understand whether machine learning models should be trained using cost-sensitive surrogates or cost-agnostic ones (e.g., cross-entropy). Analyzing this question through the lens of $\mathcal{H}$-calibration, we find that cost-sensitive surrogates can strictly outperform their cost-agnostic counterparts when learning small models under common distributional assumptions. Since these distributional assumptions are hard to verify in practice, we also show that cost-sensitive surrogates consistently outperform cost-agnostic surrogates on classification datasets from the UCI repository. Together, these make a strong case for using cost-sensitive surrogates in practice.


Optimizing F-Measures by Cost-Sensitive Classification

Shameem Puthiya Parambath, Nicolas Usunier, Yves Grandvalet

Neural Information Processing Systems

We present a theoretical analysis of F -measures for binary, multiclass and multilabel classification. These performance measures are non-linear, but in many scenarios they are pseudo-linear functions of the per-class false negative/false positive rate. Based on this observation, we present a general reduction of F - measure maximization to cost-sensitive classification with unknown costs. We then propose an algorithm with provable guarantees to obtain an approximately optimal classifier for the F -measure by solving a series of cost-sensitive classification problems. The strength of our analysis is to be valid on any dataset and any class of classifiers, extending the existing theoretical results on F -measures, which are asymptotic in nature. We present numerical experiments to illustrate the relative importance of cost asymmetry and thresholding when learning linear classifiers on various F -measure optimization tasks.


Optimizing F-Measures by Cost-Sensitive Classification

Neural Information Processing Systems

We present a theoretical analysis of F-measures for binary, multiclass and multilabel classification. These performance measures are non-linear, but in many scenarios they are pseudo-linear functions of the per-class false negative/false positive rate. Based on this observation, we present a general reduction of F-measure maximization to cost-sensitive classification with unknown costs. We then propose an algorithm with provable guarantees to obtain an approximately optimal classifier for the F-measure by solving a series of cost-sensitive classification problems. The strength of our analysis is to be valid on any dataset and any class of classifiers, extending the existing theoretical results on F-measures, which are asymptotic in nature.


Optimizing F-Measures by Cost-Sensitive Classification

Neural Information Processing Systems

We present a theoretical analysis of F -measures for binary, multiclass and multilabel classification. These performance measures are non-linear, but in many scenarios they are pseudo-linear functions of the per-class false negative/false positive rate. Based on this observation, we present a general reduction of F - measure maximization to cost-sensitive classification with unknown costs. We then propose an algorithm with provable guarantees to obtain an approximately optimal classifier for the F -measure by solving a series of cost-sensitive classification problems. The strength of our analysis is to be valid on any dataset and any class of classifiers, extending the existing theoretical results on F -measures, which are asymptotic in nature. We present numerical experiments to illustrate the relative importance of cost asymmetry and thresholding when learning linear classifiers on various F -measure optimization tasks.